Self-Organizing Machine Translation: Example-Driven Induction of Transfer Functions
Abstract
Come, let us go down and there make such a babble of their language that they will not understand one another's speech. (Genesis 11:7)

With the advent of faster computers, the notion of doing machine translation from a huge stored database of translation examples is no longer unreasonable. This paper describes an attempt to merge the Example-Based Machine Translation (EBMT) approach with psycholinguistic principles. A new formalism for context-free grammars, called marker-normal form, is demonstrated and used to describe language data in a way compatible with psycholinguistic theories. By embedding this formalism in a standard multivariate optimization framework, a system can be built that infers correct transfer functions for a set of bilingual sentence pairs and then uses those functions to translate novel sentences. The validity of this line of reasoning has been tested in the development of a system called METLA-1. This system has been used to infer English→French and English→Urdu transfer functions from small corpora. The results of those experiments are examined in both engineering and linguistic terms. In general, the results were psychologically and linguistically well-grounded while still achieving a respectable level of success when compared against a similar prototype using Hidden Markov Models.
Related resources
An Example-Based Method for Transfer-Driven Machine Translation
This paper presents a method called Transfer-Driven Machine Translation (TDMT), which utilizes an example-based framework for various processes and combines multi-level knowledge. An example-based framework can achieve quick processing and consistently describe knowledge. It is useful for spoken-language translation, which needs robust and efficient translation. TDMT strengthens the example-based...
Transfer-Driven Machine Translation
Transfer-Driven Machine Translation (TDMT) [1, 2] is a translation technique developed as a research project at ATR Interpreting Telecommunications Research Laboratories. In TDMT, translation is performed mainly by a transfer module which applies transfer knowledge to an input sentence. Other modules, such as lexical processing, analysis, contextual processing and generation, cooperate with the...
Automatic Rule Induction in Arabic to English Machine Translation Framework
This chapter addresses the exploitation of a supervised machine learning technique to automatically induce Arabic-to-English transfer rules from chunks of parallel aligned linguistic resources. The induced structural transfer rules encode the linguistic translation knowledge for converting an Arabic syntactic structure into a target English syntactic structure. These rules are going to be an in...
Adaptive Translation: Finding Interlingual Mappings Using Self-Organizing Maps
This paper presents a method for creating interlingual word-to-word or phrase-to-phrase mappings between any two languages using the self-organizing map algorithm. The method can be used as a component in a statistical machine translation system. The conceptual space created by the self-organizing map serves as a kind of interlingual representation. The specific problems of machine translation a...
Inducing Translation Templates for Example-Based Machine Translation
This paper describes an example-based machine translation (EBMT) system which relies on various knowledge resources. Morphological analyses abstract the surface forms of the languages to be translated. A shallow syntactic rule formalism is used to percolate features in derivation trees. Translation examples serve the decomposition of the text to be translated and determine the transfer of lexical...
Journal title:
- CoRR
Volume: abs/cmp-lg/9406012
Publication date: 1994